Updating the rl/main.py yaml values to match the forge config lookup code. #104
$subject.
Here is the working execution. It fixes the following two errors.
(forge) [[email protected] ~/forge_fork (main)]$ python -m apps.rl.main --config apps/rl/llama3_8b.yaml
Traceback (most recent call last):
File "/home/pradeepfdo/.conda/envs/forge/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/pradeepfdo/.conda/envs/forge/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/pradeepfdo/forge_fork/apps/rl/main.py", line 63, in <module>
sys.exit(recipe_main())
File "/home/pradeepfdo/.conda/envs/forge/lib/python3.10/site-packages/forge/cli/config.py", line 180, in wrapper
sys.exit(recipe_main(conf))
File "/home/pradeepfdo/forge_fork/apps/rl/main.py", line 59, in recipe_main
asyncio.run(run(cfg))
File "/home/pradeepfdo/.conda/envs/forge/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/home/pradeepfdo/.conda/envs/forge/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/home/pradeepfdo/forge_fork/apps/rl/main.py", line 29, in run
trainer, buffer = await asyncio.gather(
File "/home/pradeepfdo/.conda/envs/forge/lib/python3.10/site-packages/forge/controller/proc_mesh.py", line 53, in spawn_actors
mesh = await get_proc_mesh(processes)
File "/home/pradeepfdo/.conda/envs/forge/lib/python3.10/site-packages/forge/controller/proc_mesh.py", line 77, in get_proc_mesh
if process_config.with_gpus:
File "/home/pradeepfdo/.conda/envs/forge/lib/python3.10/site-packages/omegaconf/dictconfig.py", line 355, in __getattr__
self._format_and_raise(
File "/home/pradeepfdo/.conda/envs/forge/lib/python3.10/site-packages/omegaconf/base.py", line 231, in _format_and_raise
format_and_raise(
File "/home/pradeepfdo/.conda/envs/forge/lib/python3.10/site-packages/omegaconf/_utils.py", line 899, in format_and_raise
_raise(ex, cause)
File "/home/pradeepfdo/.conda/envs/forge/lib/python3.10/site-packages/omegaconf/_utils.py", line 797, in _raise
raise ex.with_traceback(sys.exc_info()[2]) # set env var OC_CAUSE=1 for full trace
File "/home/pradeepfdo/.conda/envs/forge/lib/python3.10/site-packages/omegaconf/dictconfig.py", line 351, in __getattr__
return self._get_impl(
File "/home/pradeepfdo/.conda/envs/forge/lib/python3.10/site-packages/omegaconf/dictconfig.py", line 442, in _get_impl
node = self._get_child(
File "/home/pradeepfdo/.conda/envs/forge/lib/python3.10/site-packages/omegaconf/basecontainer.py", line 73, in _get_child
child = self._get_node(
File "/home/pradeepfdo/.conda/envs/forge/lib/python3.10/site-packages/omegaconf/dictconfig.py", line 480, in _get_node
raise ConfigKeyError(f"Missing key {key!s}")
omegaconf.errors.ConfigAttributeError: Missing key with_gpus
full_key: trainer.processes.with_gpus
object_type=dict
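
The traceback above fails because `get_proc_mesh` reads `process_config.with_gpus` and the yaml has no such key under `trainer.processes`. As a rough illustration only (this is not forge's or OmegaConf's API), the sketch below checks a plain nested dict for required dotted key paths before any actors are spawned. The helper `missing_keys`, the `num_procs` key, and the `False` value for `with_gpus` are hypothetical; only the key path `trainer.processes.with_gpus` comes from the error message.

```python
from typing import Any, Iterable, Mapping


def missing_keys(cfg: Mapping[str, Any], required: Iterable[str]) -> list[str]:
    """Return the dotted key paths from `required` that are absent in `cfg`."""
    missing = []
    for path in required:
        node: Any = cfg
        for part in path.split("."):
            # Stop at the first level where the path cannot be followed.
            if not isinstance(node, Mapping) or part not in node:
                missing.append(path)
                break
            node = node[part]
    return missing


# Key path taken from the error message (full_key).
REQUIRED = ["trainer.processes.with_gpus"]

# Hypothetical configs: `num_procs` and the value of `with_gpus` are
# assumptions made for illustration, not values from the real yaml.
broken = {"trainer": {"processes": {"num_procs": 1}}}
fixed = {"trainer": {"processes": {"num_procs": 1, "with_gpus": False}}}

print(missing_keys(broken, REQUIRED))  # reports the missing path
print(missing_keys(fixed, REQUIRED))   # reports nothing missing
```

A check like this reports the yaml mismatch by its full key path up front, rather than from deep inside the spawn call.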